1.  INTRODUCTION


This  report  is  concerned  with  techniques  of  estimating  the 
parameter  values  of  mathematical  models.  It  begins  with  a  brief 
discussion  of  mathematical  models  and  how  they  relate  to  the  physical 
situations  they  attempt  to  describe.  Next,  the  method  of  least  squares 
as  it  applies  to  both  linear  and  nonlinear  models  is  investigated. 
Special emphasis is given to fitting nonlinear models by the linearization
method. The report concludes with an example from climatology.


2.  MATHEMATICAL  MODELS 


A  problem  that  arises  in  many  fields  is  that  of  determining 
what relationships exist between variables. Suppose, for example,
that  an  experiment  has  produced  a  sample  of  data  consisting  of  a  number 
of  simultaneous  observations  of  a  group  of  variables.  It  would  be 
convenient  to  put  the  data  in  the  form  of  an  array  as  shown  below. 


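\[
\begin{array}{ccccc}
y_1 & x_{11} & x_{12} & \cdots & x_{1k} \\
y_2 & x_{21} & x_{22} & \cdots & x_{2k} \\
\vdots & \vdots & \vdots & & \vdots \\
y_i & x_{i1} & x_{i2} & \cdots & x_{ik} \\
\vdots & \vdots & \vdots & & \vdots \\
y_n & x_{n1} & x_{n2} & \cdots & x_{nk}
\end{array}
\]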

Each  row  represents  one  simultaneous  observation  of  all  the  variables, 
and  each  column  represents  all  the  observed  values  of  a  particular 
variable.  One  variable  (which  one  depending  on  the  nature  of  the 
experiment)  is  designated  the  response  or  dependent  variable  (denoted 
by $y$), and the others are called independent variables (denoted by
$x_1, x_2, \ldots, x_k$). From the sample of data, a researcher would like to

make  inferences  about  possible  relationships  existing  between  the 
independent  and  dependent  variables.  Mathematical  models  are  commonly 
used  to  describe  and  measure  the  strength  of  these  relationships.  They 
also  provide  a  means  of  predicting  the  value  of  the  dependent  variable 
for  any  given  setting  of  the  independent  variables. 

The  relationship  between  the  dependent  and  independent  variables 
is  rarely  a  functional  one,  where  the  behavior  of  the  dependent  variable 
can  be  predicted  exactly  from  the  independent  variables.  Experiments 
are  of  course  vulnerable  to  error,  and  the  results  may  include  effects 
from  outside  sources.  Consequently,  the  mathematical  models  under 
consideration  must  allow  for  noise  in  the  data.  Models  of  this  kind 
are  called  probabilistic  models  and  have  the  form 

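\[ y = f(\mathbf{x}, \boldsymbol{\theta}) + r, \]

where $\mathbf{x} = (x_1, x_2, \ldots, x_k)$ is a vector of independent variables,
$\boldsymbol{\theta} = (\theta_1, \theta_2, \ldots, \theta_p)$ is a vector of parameters, and $r$ is the residual,
or difference between the observed value $y$ and the predicted value
$f(\mathbf{x}, \boldsymbol{\theta})$ of the dependent variable.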

The residual $r$ is made up of two components: the pure experimental
error ($\varepsilon$) and the error due to lack of fit ($e$). The pure experimental
error may be due to inaccuracies in the instruments, human mistakes, and so on.
It is an inherent part of the data at this point and beyond our control.
Some assumptions are usually made about $\varepsilon$: that it is random in nature
and has expectation zero. If this is so, then the data have no systematic
source of error that would tend to skew the results in one direction or
another. The error due to lack of fit is the error in the model, and it
too may arise from a number of sources. Perhaps not all the significant
variables are accounted for, or perhaps the relationship expressed by
the model is not an appropriate one to use.

With these considerations, the model may be rewritten

\[ y = f(\mathbf{x}, \boldsymbol{\theta}) + \varepsilon + e, \qquad \text{where } r = \varepsilon + e. \]

The function $f$ and the computed values of the parameters are ideally those
that minimize $e$ and average out $\varepsilon$.

Once  a  particular  function  f  is  chosen  for  the  model,  the  next 
problem  is  to  compute  the  parameter  values.  The  parameter  values  are 
those  that  best  describe  the  particular  sample  of  data  available,  and 
are  used  as  estimates  of  the  parameter  values  that  best  fit  the  larger 
population. The methods of computing the parameter values depend on
the form of $f$; specifically, whether $f$ is linear or nonlinear with
respect to its parameters. Linear models have the form

\[ y = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \cdots + \theta_k x_k + r, \]

and models not of this form, with respect to the parameters, are nonlinear.

Some examples of linear and nonlinear models are shown below.

Some Linear Models

\[ y = \theta_0 + \theta_1 x + \theta_2 x^2 + \theta_3 x^3 + r \quad \text{(cubic polynomial, one independent variable)} \]

\[ y = \theta_1 \sin x_1 + \theta_2 x_2 + \theta_3 x_2 \sin x_1 + r \quad \text{(two independent variables)} \]

Some Nonlinear Models

\[ y = 1 - \exp(-\theta_1 x^{\theta_2}) + r \quad \text{(used in modeling visibility, see reference [5])} \]

\[ y = 1 - (1 - x^{\theta_1})^{\theta_2} + r \quad \text{(used in modeling skycover, see reference [6])} \]
3.  METHOD  OF  LEAST  SQUARES  IN  THE  LINEAR  CASE


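Suppose that the set of data

\[
\begin{array}{ccccc}
y_1 & x_{11} & x_{12} & \cdots & x_{1k} \\
y_2 & x_{21} & x_{22} & \cdots & x_{2k} \\
\vdots & \vdots & \vdots & & \vdots \\
y_i & x_{i1} & x_{i2} & \cdots & x_{ik} \\
\vdots & \vdots & \vdots & & \vdots \\
y_n & x_{n1} & x_{n2} & \cdots & x_{nk}
\end{array}
\]

is given and a linear model,

\[ y = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \cdots + \theta_k x_k + r, \]

is chosen to express a relationship between the dependent variable $y$ and
the independent variables $x_1, x_2, \ldots, x_k$.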

Regardless of what values may be arbitrarily assigned to the
parameters $\theta_0, \theta_1, \ldots, \theta_k$, for each observation in the data set there
exists an $r_i$ so that

\[ y_i = \theta_0 + \theta_1 x_{i1} + \theta_2 x_{i2} + \cdots + \theta_k x_{ik} + r_i. \]

This equation is called an observation equation. Solving for $r_i$ gives

\[ r_i = y_i - \theta_0 - \theta_1 x_{i1} - \theta_2 x_{i2} - \cdots - \theta_k x_{ik}. \]

Intuitively, one would like to find the values for $\theta_0, \theta_1, \ldots, \theta_k$ that
minimize $|r_i|$ for each observation in the data set. The method of least
squares finds those parameter values that minimize instead the sum of the
squared residuals over all the observations. That is, those parameter
values that minimize

\[ RSS = \sum_{i=1}^{n} r_i^2 = \sum_{i=1}^{n} (y_i - \theta_0 - \theta_1 x_{i1} - \theta_2 x_{i2} - \cdots - \theta_k x_{ik})^2 \]

are found.
Observe that RSS is a function of the parameters $\theta_0, \theta_1, \ldots, \theta_k$;
$y_i, x_{i1}, x_{i2}, \ldots, x_{ik}$ all being known values (the data). To minimize
RSS, its partial derivative with respect to each $\theta_j$ is taken and set
equal to zero. The resulting system of $k + 1$ linear equations in $k + 1$
unknowns (often called the normal equations) has as its solution
the vector of parameter values that minimize RSS.


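The partial derivatives are shown below:

\[
\begin{aligned}
\frac{\partial RSS}{\partial \theta_0} &= -2 \sum_{i=1}^{n} (y_i - \theta_0 - \theta_1 x_{i1} - \theta_2 x_{i2} - \cdots - \theta_k x_{ik}) \\
\frac{\partial RSS}{\partial \theta_1} &= -2 \sum_{i=1}^{n} (y_i - \theta_0 - \theta_1 x_{i1} - \theta_2 x_{i2} - \cdots - \theta_k x_{ik})\, x_{i1} \\
&\;\;\vdots \\
\frac{\partial RSS}{\partial \theta_j} &= -2 \sum_{i=1}^{n} (y_i - \theta_0 - \theta_1 x_{i1} - \theta_2 x_{i2} - \cdots - \theta_k x_{ik})\, x_{ij} \\
&\;\;\vdots \\
\frac{\partial RSS}{\partial \theta_k} &= -2 \sum_{i=1}^{n} (y_i - \theta_0 - \theta_1 x_{i1} - \theta_2 x_{i2} - \cdots - \theta_k x_{ik})\, x_{ik}
\end{aligned}
\]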

Setting  them  equal  to  zero  and  simplifying  them  algebraically  yields 
the  system  of  normal  equations: 


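\[
\begin{aligned}
n\,\theta_0 + \theta_1 \sum x_{i1} + \theta_2 \sum x_{i2} + \cdots + \theta_k \sum x_{ik} &= \sum y_i \\
\theta_0 \sum x_{i1} + \theta_1 \sum x_{i1}^2 + \theta_2 \sum x_{i2} x_{i1} + \cdots + \theta_k \sum x_{ik} x_{i1} &= \sum y_i x_{i1} \\
&\;\;\vdots \\
\theta_0 \sum x_{ij} + \theta_1 \sum x_{i1} x_{ij} + \theta_2 \sum x_{i2} x_{ij} + \cdots + \theta_k \sum x_{ik} x_{ij} &= \sum y_i x_{ij} \\
&\;\;\vdots \\
\theta_0 \sum x_{ik} + \theta_1 \sum x_{i1} x_{ik} + \theta_2 \sum x_{i2} x_{ik} + \cdots + \theta_k \sum x_{ik}^2 &= \sum y_i x_{ik}
\end{aligned}
\]

where every sum runs over $i = 1, \ldots, n$.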
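As a small worked illustration (an added example, not part of the original derivation), take the straight-line model $y = \theta_0 + \theta_1 x + r$, that is, $k = 1$ with a single independent variable $x$. Under that assumption the normal equations collapse to two equations that can be solved by hand:

```latex
\begin{aligned}
n\,\theta_0 + \theta_1 \sum x_i &= \sum y_i \\
\theta_0 \sum x_i + \theta_1 \sum x_i^2 &= \sum x_i y_i
\end{aligned}
\qquad\Longrightarrow\qquad
\theta_1 = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \left( \sum x_i \right)^2},
\quad
\theta_0 = \frac{\sum y_i - \theta_1 \sum x_i}{n}
```

All sums run over $i = 1, \ldots, n$. The denominator is nonzero as long as the $x_i$ are not all equal.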
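The equations may be further simplified by using matrix notation.
Let $Y$ be the column vector of values of the dependent variable,

\[ Y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_i \\ \vdots \\ y_n \end{bmatrix}, \]

and let $X$ be the matrix of values of the independent
variables, appended on the left with a column vector of 1's:

\[
X = \begin{bmatrix}
1 & x_{11} & x_{12} & \cdots & x_{1k} \\
1 & x_{21} & x_{22} & \cdots & x_{2k} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & x_{i1} & x_{i2} & \cdots & x_{ik} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & x_{n1} & x_{n2} & \cdots & x_{nk}
\end{bmatrix}.
\]

Lastly, define $\boldsymbol{\theta}$ to be the column vector
of parameters in the model and $\mathbf{r}$ to be the column vector of residuals:

\[
\boldsymbol{\theta} = \begin{bmatrix} \theta_0 \\ \theta_1 \\ \theta_2 \\ \vdots \\ \theta_j \\ \vdots \\ \theta_k \end{bmatrix},
\qquad
\mathbf{r} = \begin{bmatrix} r_1 \\ r_2 \\ \vdots \\ r_i \\ \vdots \\ r_n \end{bmatrix}.
\]

Then the system of $n$ observation equations

\[
\begin{aligned}
y_1 &= \theta_0 + \theta_1 x_{11} + \cdots + \theta_k x_{1k} + r_1 \\
y_2 &= \theta_0 + \theta_1 x_{21} + \cdots + \theta_k x_{2k} + r_2 \\
&\;\;\vdots \\
y_i &= \theta_0 + \theta_1 x_{i1} + \cdots + \theta_k x_{ik} + r_i \\
&\;\;\vdots \\
y_n &= \theta_0 + \theta_1 x_{n1} + \cdots + \theta_k x_{nk} + r_n
\end{aligned}
\]

may be written simply as

\[ Y = X \boldsymbol{\theta} + \mathbf{r}. \]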

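Furthermore, the normal equations may be rewritten in terms of matrix
multiplication:

\[
\begin{bmatrix}
n & \sum x_{i1} & \sum x_{i2} & \cdots & \sum x_{ik} \\
\sum x_{i1} & \sum x_{i1}^2 & \sum x_{i2} x_{i1} & \cdots & \sum x_{ik} x_{i1} \\
\sum x_{i2} & \sum x_{i1} x_{i2} & \sum x_{i2}^2 & \cdots & \sum x_{ik} x_{i2} \\
\vdots & \vdots & \vdots & & \vdots \\
\sum x_{ij} & \sum x_{i1} x_{ij} & \sum x_{i2} x_{ij} & \cdots & \sum x_{ik} x_{ij} \\
\vdots & \vdots & \vdots & & \vdots \\
\sum x_{ik} & \sum x_{i1} x_{ik} & \sum x_{i2} x_{ik} & \cdots & \sum x_{ik}^2
\end{bmatrix}
\begin{bmatrix} \theta_0 \\ \theta_1 \\ \theta_2 \\ \vdots \\ \theta_j \\ \vdots \\ \theta_k \end{bmatrix}
=
\begin{bmatrix} \sum y_i \\ \sum y_i x_{i1} \\ \sum y_i x_{i2} \\ \vdots \\ \sum y_i x_{ij} \\ \vdots \\ \sum y_i x_{ik} \end{bmatrix}
\]

Noting that the matrix on the left is $X^T X$ and the matrix on the right
is $X^T Y$, the entire system of normal equations may be expressed by the
single matrix equation

\[ (X^T X)\,\boldsymbol{\theta} = (X^T Y). \]

Finally, solving for $\boldsymbol{\theta}$ gives

\[ \boldsymbol{\theta} = (X^T X)^{-1} (X^T Y). \]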

This is a very powerful result, since it holds for any linear model
and corresponding set of $n \ge k + 1$ observations (where $n$ is the number of
observations and $k + 1$ is the number of parameters; that is, there should
be at least as many observations in the data set as parameters in the
model, and the columns of $X$ must be linearly independent so that $X^T X$
can be inverted). The first step would be to construct the matrices $X$ and $Y$ from
the data. Then the parameter values that minimize RSS are those that
satisfy the linear system

\[ (X^T X)\,\boldsymbol{\theta} = (X^T Y). \]

Most computer installations have routines for solving systems of linear
equations. As a warning, however, this particular system is often
ill-conditioned, so round-off error is an important factor in the computations.
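The construction of $X$ and the solution of the normal equations can be sketched in a few lines of Python with NumPy. This is a minimal illustration, not the report's own program: the function name `fit_linear_normal_equations` and the data `x_demo`, `y_demo` are hypothetical and were chosen so that the exact answer is known. The sketch also compares the result with `np.linalg.lstsq`, which works on $X$ directly, and computes condition numbers to illustrate the warning about ill-conditioning.

```python
import numpy as np

def fit_linear_normal_equations(x_data, y):
    """Solve (X^T X) theta = X^T Y for a linear model with an intercept."""
    x_data = np.asarray(x_data, dtype=float)
    y = np.asarray(y, dtype=float)
    if x_data.ndim == 1:
        x_data = x_data[:, None]               # single independent variable
    n = x_data.shape[0]
    X = np.hstack([np.ones((n, 1)), x_data])   # column of 1's appended on the left
    theta = np.linalg.solve(X.T @ X, X.T @ y)  # normal equations
    return theta, X

# Hypothetical data (not from the report): y = 1 + 2*x1 - 3*x2 exactly.
x_demo = np.array([[0.0, 1.0], [1.0, 0.0], [2.0, 1.0], [3.0, 5.0], [4.0, 2.0]])
y_demo = 1.0 + 2.0 * x_demo[:, 0] - 3.0 * x_demo[:, 1]

theta_hat, X_demo = fit_linear_normal_equations(x_demo, y_demo)

# Same fit without forming X^T X explicitly.
theta_lstsq = np.linalg.lstsq(X_demo, y_demo, rcond=None)[0]

# Forming X^T X squares the 2-norm condition number of X.
cond_X = np.linalg.cond(X_demo)
cond_XtX = np.linalg.cond(X_demo.T @ X_demo)
```

Because the condition number of $X^T X$ is the square of that of $X$, a routine that factors $X$ itself is usually the safer choice when the columns of $X$ are nearly dependent.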
4.  METHOD  OF  LEAST  SQUARES  IN  THE  NONLINEAR  CASE


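Suppose that the sample of data

\[
\begin{array}{ccccc}
y_1 & x_{11} & x_{12} & \cdots & x_{1k} \\
y_2 & x_{21} & x_{22} & \cdots & x_{2k} \\
\vdots & \vdots & \vdots & & \vdots \\
y_i & x_{i1} & x_{i2} & \cdots & x_{ik} \\
\vdots & \vdots & \vdots & & \vdots \\
y_n & x_{n1} & x_{n2} & \cdots & x_{nk}
\end{array}
\]

is given and a nonlinear model, $y = f(\mathbf{x}, \boldsymbol{\theta}) + r$, is chosen to express
a relationship between the dependent variable $y$ and the independent
variables $x_1, x_2, \ldots, x_k$.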

Proceeding in much the same way as for the linear model, regardless
of what values may be arbitrarily assigned to the parameters, for each
observation in the data set there exists an $r_i$ so that

\[ y_i = f(\mathbf{x}_i, \boldsymbol{\theta}) + r_i. \]

Solving for $r_i$ gives

\[ r_i = y_i - f(\mathbf{x}_i, \boldsymbol{\theta}). \]

The method of least squares minimizes the quantity

\[ RSS = \sum_{i=1}^{n} (y_i - f(\mathbf{x}_i, \boldsymbol{\theta}))^2 \]

with respect to its parameters by setting the partial derivatives equal
to zero and solving the resulting system of equations. The partial
derivatives are of the form shown below:

\[ \frac{\partial RSS}{\partial \theta_j} = -2 \sum_{i=1}^{n} (y_i - f(\mathbf{x}_i, \boldsymbol{\theta}))\, \frac{\partial f(\mathbf{x}_i, \boldsymbol{\theta})}{\partial \theta_j} \qquad \text{for } j = 1, \ldots, p. \]
Therefore, the system of normal equations is

\[ \sum_{i=1}^{n} (y_i - f(\mathbf{x}_i, \boldsymbol{\theta}))\, \frac{\partial f(\mathbf{x}_i, \boldsymbol{\theta})}{\partial \theta_j} = 0 \qquad \text{for } j = 1, \ldots, p. \]


This  approach  may  be  fine  for  some  nonlinear  models,  but  for  others 
difficulties  arise.  The  model  may  be  so  complex  that  the  partial  derivatives 
are  difficult  to  obtain.  Furthermore,  since  the  model  is  nonlinear  the 
partial  derivatives  and  normal  equations  are  also  nonlinear.  Systems  of 
nonlinear  equations  can  be  difficult  to  solve,  and  iterative  techniques 
are  almost  always  required.  Also,  more  than  one  solution  to  the  system 
may  exist,  corresponding  to  the  critical  values  of  RSS.  In  this  case, 
each  solution  must  be  tested  to  determine  which  produces  the  minimum 
value  of  RSS.  For  these  reasons,  other  methods  have  been  devised  for 
estimating  the  parameter  values  of  a  nonlinear  model. 
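To see the difficulty in the simplest possible setting, consider a hypothetical one-parameter model $f(x, \theta) = 1 - e^{-\theta x}$ (the visibility model listed earlier with its exponent fixed at 1; this special case is chosen here purely for illustration). Since $\partial f / \partial \theta = x\, e^{-\theta x}$, the single normal equation is

```latex
\sum_{i=1}^{n} \left( y_i - 1 + e^{-\theta x_i} \right) x_i \, e^{-\theta x_i} = 0
```

The unknown $\theta$ appears inside every exponential, so the equation cannot be rearranged into a closed-form expression for $\theta$ and has to be solved numerically.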
5.  THE  LINEARIZATION  METHOD


The  linearization  method  of  nonlinear  regression  has  three  distinguishing 
features.  First,  it  is  an  iterative  procedure.  Given  an  initial  estimate 
of  the  parameter  values,  a  "better"  estimate  is  computed  with  each  successive 
iteration. The parameter values are refined in this manner an unknown number
of  times  until  finally  some  stopping  criterion  is  met.  Second,  it  is  not 
self-starting.  This  means  that  the  method  does  not  provide  a  way  of  obtaining 
that  initial  estimate  of  the  parameter  values.  The  researcher  must  use 
whatever  information  is  at  hand  to  make  the  first  guess.  Lastly,  linear 
approximations  of  the  nonlinear  model  are  used  to  compute  the  parameter 
values,  so  the  techniques  already  developed  for  linear  models  are  applicable. 

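Let $\mathbf{T} = (t_1, t_2, \ldots, t_p)$ be the current best estimate of $\boldsymbol{\theta} = (\theta_1, \theta_2, \ldots, \theta_p)$.
To "linearize" the nonlinear model

\[ y = f(\mathbf{x}, \boldsymbol{\theta}) + r, \]

a first order Taylor series expansion of $f$ about $\mathbf{T}$ is made. The expansion
is a linear polynomial in $(\theta_j - t_j)$ of the form

\[ P(\mathbf{x}, \boldsymbol{\theta}) = f(\mathbf{x}, \mathbf{T}) + \sum_{j=1}^{p} \frac{\partial f(\mathbf{x}, \mathbf{T})}{\partial \theta_j} (\theta_j - t_j). \]

For $\boldsymbol{\theta}$ close to $\mathbf{T}$, $P(\mathbf{x}, \boldsymbol{\theta}) \approx f(\mathbf{x}, \boldsymbol{\theta})$. Therefore, substituting $P$ for $f$ gives
a linear approximation of the nonlinear model:

\[
\begin{aligned}
y &= P(\mathbf{x}, \boldsymbol{\theta}) + r \\
y &= f(\mathbf{x}, \mathbf{T}) + \sum_{j=1}^{p} \frac{\partial f(\mathbf{x}, \mathbf{T})}{\partial \theta_j} (\theta_j - t_j) + r \\
y - f(\mathbf{x}, \mathbf{T}) &= \frac{\partial f(\mathbf{x}, \mathbf{T})}{\partial \theta_1} (\theta_1 - t_1) + \frac{\partial f(\mathbf{x}, \mathbf{T})}{\partial \theta_2} (\theta_2 - t_2) + \cdots + \frac{\partial f(\mathbf{x}, \mathbf{T})}{\partial \theta_p} (\theta_p - t_p) + r.
\end{aligned}
\]

In terms of the linear approximation, the system of observation equations is

\[ y_i - f(\mathbf{x}_i, \mathbf{T}) = \frac{\partial f(\mathbf{x}_i, \mathbf{T})}{\partial \theta_1} (\theta_1 - t_1) + \frac{\partial f(\mathbf{x}_i, \mathbf{T})}{\partial \theta_2} (\theta_2 - t_2) + \cdots + \frac{\partial f(\mathbf{x}_i, \mathbf{T})}{\partial \theta_p} (\theta_p - t_p) + r_i \]

for $i = 1, 2, \ldots, n$. Note that $y_i$, $\mathbf{x}_i$, $\mathbf{T}$, $f(\mathbf{x}_i, \mathbf{T})$, and $\partial f(\mathbf{x}_i, \mathbf{T}) / \partial \theta_j$ are all
known values or values that can be computed directly. The only unknowns are
the $\theta_j$ in $(\theta_j - t_j)$. Letting

\[ w_{ij} = \frac{\partial f(\mathbf{x}_i, \mathbf{T})}{\partial \theta_j}, \qquad z_i = y_i - f(\mathbf{x}_i, \mathbf{T}), \qquad \delta_j = \theta_j - t_j, \]

and the matrices

\[
W = \begin{bmatrix}
w_{11} & w_{12} & \cdots & w_{1p} \\
w_{21} & w_{22} & \cdots & w_{2p} \\
\vdots & \vdots & & \vdots \\
w_{n1} & w_{n2} & \cdots & w_{np}
\end{bmatrix},
\quad
Z = \begin{bmatrix} z_1 \\ z_2 \\ \vdots \\ z_n \end{bmatrix}
= \begin{bmatrix} y_1 - f(\mathbf{x}_1, \mathbf{T}) \\ y_2 - f(\mathbf{x}_2, \mathbf{T}) \\ \vdots \\ y_n - f(\mathbf{x}_n, \mathbf{T}) \end{bmatrix},
\quad
\boldsymbol{\delta} = \begin{bmatrix} \delta_1 \\ \delta_2 \\ \vdots \\ \delta_p \end{bmatrix}
= \begin{bmatrix} \theta_1 - t_1 \\ \theta_2 - t_2 \\ \vdots \\ \theta_p - t_p \end{bmatrix},
\quad \text{and} \quad
\mathbf{r} = \begin{bmatrix} r_1 \\ r_2 \\ \vdots \\ r_n \end{bmatrix},
\]

then the system of observation equations may be written

\[ Z = W \boldsymbol{\delta} + \mathbf{r}. \]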

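Applying linear least squares theory, the solution for $\boldsymbol{\delta}$ that minimizes

\[ RSS_{lin} = \sum_{i=1}^{n} \left[ y_i - f(\mathbf{x}_i, \mathbf{T}) - \sum_{j=1}^{p} \frac{\partial f(\mathbf{x}_i, \mathbf{T})}{\partial \theta_j} (\theta_j - t_j) \right]^2 \]

is given by

\[ \boldsymbol{\delta} = (W^T W)^{-1} (W^T Z), \]

and the new estimate of $\boldsymbol{\theta}$ is

\[ \boldsymbol{\theta} = \boldsymbol{\delta} + \mathbf{T}. \]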

The vector $\boldsymbol{\delta}$ may be thought of as a correction vector: when it is added to
the old estimate of the parameter values, a newer, "better" estimate is
obtained. The method now calls for substituting the new estimate of the
parameter values in for $\mathbf{T}$, and repeating the procedure. The following
flowchart outlines the algorithm.
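   1.  Set $\mathbf{T}$ equal to the starting values of the parameters.
   2.  Calculate $W$ and $Z$.
   3.  Compute $\boldsymbol{\delta} = (W^T W)^{-1} (W^T Z)$.
   4.  Set $\boldsymbol{\theta} = \boldsymbol{\delta} + \mathbf{T}$.
   5.  Test for convergence. If the test fails, set $\mathbf{T} = \boldsymbol{\theta}$ and return to step 2; otherwise stop.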
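One pass through the loop (calculate $W$ and $Z$, solve for the correction, update the estimate) might be sketched as follows. This is a minimal sketch under stated assumptions: the names `linearization_step`, `f`, and `jac` are hypothetical, and `np.linalg.lstsq` is used in place of forming $(W^T W)^{-1}$ explicitly, which gives the same correction when $W$ has full column rank. The check uses a model that is already linear in its parameters, for which the Taylor expansion is exact, so a single step from any starting point should land on the least squares solution.

```python
import numpy as np

def linearization_step(f, jac, x, y, T):
    """Build W and Z at the current estimate T and return (T + delta, delta)."""
    W = jac(x, T)       # n-by-p matrix, w_ij = df(x_i, T)/d(theta_j)
    Z = y - f(x, T)     # n-vector,     z_i  = y_i - f(x_i, T)
    # Least squares solution of Z = W delta + r.
    delta = np.linalg.lstsq(W, Z, rcond=None)[0]
    return T + delta, delta

# Hypothetical check: f(x, theta) = theta_1 + theta_2 * x is linear in theta.
def f_line(x, th):
    return th[0] + th[1] * x

def jac_line(x, th):
    return np.column_stack([np.ones_like(x), x])

x_demo = np.array([0.0, 1.0, 2.0, 3.0])
y_demo = 1.0 + 2.0 * x_demo            # exact data, true parameters (1, 2)
T0 = np.array([10.0, -5.0])            # deliberately poor starting values

theta_new, delta = linearization_step(f_line, jac_line, x_demo, y_demo, T0)
```

For a genuinely nonlinear model the same function would be called repeatedly, with `theta_new` substituted for `T` each time.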


Obviously some test must be made to determine when to halt the
procedure, as indicated in the flowchart by the test for convergence.

Any one of several tests may be used. Intuitively one would halt the
procedure when the estimates of the parameter values approach a limiting
value; i.e., when the difference between two consecutive estimates is
small. If $A$ is any matrix, let $|A|$ denote the largest absolute value of
the elements of $A$. Then one possible test for convergence is the relation

\[ \left| \boldsymbol{\delta} / \mathbf{T} \right| < \epsilon_1, \]

where $\epsilon_1$ is some prescribed constant value and the division is an
element-wise division between the matrices $\boldsymbol{\delta}$ and $\mathbf{T}$. This relation tests
whether the largest proportional change in the estimates is less than $\epsilon_1$.

It is most appropriate when the relative magnitudes of the parameters
are unknown. Typical values for $\epsilon_1$ are in the range $10^{-5}$ to $10^{-6}$.
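The proportional-change test can be written as a one-line check. The sketch below is illustrative only: the function name `proportional_change_small` is hypothetical, and the two estimates are the values reported for iterations 6 and 7 in the results table of the climatology example. Note that the element-wise division is undefined if any component of the current estimate is exactly zero.

```python
import numpy as np

def proportional_change_small(delta, T, eps1=1e-5):
    """True when the largest |delta_j / t_j| is below eps1 (element-wise division)."""
    delta = np.asarray(delta, dtype=float)
    T = np.asarray(T, dtype=float)
    return bool(np.max(np.abs(delta / T)) < eps1)

# Estimates from iterations 6 and 7 of the climatology example.
T_6 = np.array([0.0552159, 0.780732])
theta_7 = np.array([0.0552162, 0.780729])
delta_67 = theta_7 - T_6

passes_loose = proportional_change_small(delta_67, T_6, eps1=1e-5)
passes_tight = proportional_change_small(delta_67, T_6, eps1=1e-6)
```

With these values the largest proportional change is roughly $5 \times 10^{-6}$, so the test is met for $\epsilon_1 = 10^{-5}$ but not for $\epsilon_1 = 10^{-6}$.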

Another test for convergence might be performed by comparing the
sum of the squared residuals for two consecutive estimates of the
parameters. The sum of the squared residuals of the nonlinear model,
given by

\[ RSS(\boldsymbol{\theta}) = \sum_{i=1}^{n} (y_i - f(\mathbf{x}_i, \boldsymbol{\theta}))^2, \]

is a function of $\boldsymbol{\theta}$ and is a measure of the goodness of fit of the model.
Recall that in the algorithm the matrices $\boldsymbol{\theta}$ and $\mathbf{T}$ represent two consecutive
estimates of the parameter values. Hence, convergence may be tested by
the relation

\[ \frac{\left| RSS(\boldsymbol{\theta}) - RSS(\mathbf{T}) \right|}{RSS(\mathbf{T})} < \epsilon_2, \]

where $\epsilon_2$ is some constant. This is appealing in the sense that it directly
tests for a difference in the fit between the two models. It is conceivable
that a wide range of values for the parameters might give approximately
the same fit, in which case this test would assume convergence
before the other. However, if the only purpose in estimating the parameter
values is to obtain a good fit, then the test is appropriate.
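A sketch of the RSS-based test follows, again with a hypothetical function name (`rss_change_small`). The numbers are the values reported for iterations 3 and 4 in the results table of the climatology example, and they illustrate the remark above: the fit has almost stopped changing while the first parameter is still moving noticeably.

```python
def rss_change_small(rss_new, rss_old, eps2=1e-8):
    """True when the proportional change in RSS between two estimates is below eps2."""
    return abs(rss_new - rss_old) / rss_old < eps2

# Values from iterations 3 and 4 of the climatology example.
rss_3, rss_4 = 0.00261354, 0.00261344
theta1_3, theta1_4 = 0.0553048, 0.055203

rss_change = abs(rss_4 - rss_3) / rss_3                  # about 4e-5
param_change = abs((theta1_4 - theta1_3) / theta1_3)     # about 2e-3
```

With a tolerance of $10^{-4}$ the RSS test would already report convergence at iteration 4, whereas the proportional-change test on the parameters with the same tolerance would not.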

Lastly, the importance of good starting values should be stressed.
Convergence in the linearization method is not guaranteed; its success,
failure, and speed may well depend on how good the starting values are.
([text lost in source] the algorithm.) As mentioned earlier, the researcher is left to
his or her own ingenuity in finding starting values, making use of whatever
information is available. One suggestion is to compute the RSS over a grid
of parameter values to get an idea of where minima
might occur. For difficult problems, more complicated variations of the
linearization method may be employed. Also, the NLIN procedure of the
Statistical Analysis System (SAS) is a software package that performs
an extensive analysis of the nonlinear regression problem.
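The grid suggestion can be sketched as below for a two-parameter model. Everything here is an assumption made for illustration: the function name `rss_grid`, the grid values, and the data, which are synthetic and generated without noise from the visibility model with parameters (0.05, 0.8) so that the location of the minimum is known in advance.

```python
import numpy as np

def rss_grid(f, x, y, grid1, grid2):
    """RSS at every (theta1, theta2) pair; rows follow grid1, columns follow grid2."""
    table = np.empty((len(grid1), len(grid2)))
    for a, th1 in enumerate(grid1):
        for b, th2 in enumerate(grid2):
            resid = y - f(x, (th1, th2))
            table[a, b] = np.sum(resid ** 2)
    return table

def weibull(x, th):
    """Visibility model f(x, theta) = 1 - exp(-theta1 * x**theta2)."""
    return 1.0 - np.exp(-th[0] * x ** th[1])

# Synthetic, noise-free data (hypothetical, not the report's observations).
x_demo = np.array([0.25, 0.5, 1.0, 2.0, 4.0, 6.0])
y_demo = weibull(x_demo, (0.05, 0.8))

grid1 = np.array([0.01, 0.03, 0.05, 0.07, 0.09])   # candidate theta1 values
grid2 = np.array([0.4, 0.6, 0.8, 1.0, 1.2])        # candidate theta2 values
table = rss_grid(weibull, x_demo, y_demo, grid1, grid2)

# Grid cell with the smallest RSS, used as the starting values.
best_index = tuple(int(v) for v in np.unravel_index(np.argmin(table), table.shape))
start = (grid1[best_index[0]], grid2[best_index[1]])
```

With real data the smallest cell will not be exactly zero, and a coarse grid only narrows down the region; the chosen cell is then handed to the iterative method as its starting point.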


6.  AN  EXAMPLE  FROM  CLIMATOLOGY


Climatological data is collected on a daily basis at several
locations around the world. The data includes observations of various
surface weather conditions such as rainfall, skycover, ceiling, and
visibility. The visibility observations recorded over the years are
grouped so that for a given location, month, and hour period there are
15 data points of the form $(x, y)$, where $x$ is distance in statute miles
and $y$ is the proportion of the time in the past that visibility has been
less than or equal to $x$. The following data comes from Goose, Newfoundland
in February during the hours of 0900 to 1100:


        x               y
     0.25            0.001
     0.3125          0.015
     0.5             0.017
     0.625           0.021
     0.75            0.025
     1.00            0.038
     1.25            0.065
     1.5             0.071
     2.00            0.095
     2.5             0.126
     3               0.1[?]4
     4               0.159
     5               0.188
     [illegible]     0.209
     [illegible]     0.261

Problem: For a given location, month, and hour period, build a probability
model relating distance ($x$) to the proportion of the time that visibility
has been less than or equal to $x$ ($y$).


The Weibull cumulative frequency distribution function has been used
to model the visibility data for several of these locations. In the
model

\[ y = f(x, \boldsymbol{\theta}) + r = 1 - e^{-\theta_1 x^{\theta_2}} + r, \]

the vector $\mathbf{x}$ of independent variables consists only of $x$, and $\boldsymbol{\theta}$, the
vector of parameters, is $(\theta_1, \theta_2)$. The model is obviously nonlinear in
its parameters, so the linearization method is used to compute the
parameter values.

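The first problem is that of finding starting values for the
parameters. These may be obtained by taking two observations from the
data set, substituting them into the formula

\[ y = 1 - e^{-\theta_1 x^{\theta_2}}, \]

and solving for $\theta_1$ and $\theta_2$, which is equivalent to fitting the curve
through the two data points exactly. If $(x_i, y_i)$ and $(x_j, y_j)$ are the
two observations chosen, then we have the system

\[
\begin{aligned}
y_i &= 1 - e^{-\theta_1 x_i^{\theta_2}} \\
y_j &= 1 - e^{-\theta_1 x_j^{\theta_2}}
\end{aligned}
\]

which, after some manipulation, yields

\[ \theta_2 = \frac{\ln\left[ \ln(1 - y_i) / \ln(1 - y_j) \right]}{\ln(x_i / x_j)}, \qquad \theta_1 = \frac{-\ln(1 - y_i)}{x_i^{\theta_2}}. \]

Using (1.00, 0.038) and (5.00, 0.188) from the data shown earlier, the
starting values are

\[ \theta_1 = .0387408 \quad \text{and} \quad \theta_2 = 1.045. \]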
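The two-point calculation is easy to reproduce. The sketch below solves the two-equation system in closed form by taking logarithms twice; the function name `weibull_start` is hypothetical, and the two observations are taken to be (1.00, 0.038) and (5.00, 0.188), the pair that reproduces the starting values reported above.

```python
import math

def weibull_start(xi, yi, xj, yj):
    """Fit y = 1 - exp(-theta1 * x**theta2) exactly through two points."""
    # -ln(1 - y) = theta1 * x**theta2, so the ratio of the two logs is (xi/xj)**theta2.
    theta2 = math.log(math.log(1 - yi) / math.log(1 - yj)) / math.log(xi / xj)
    theta1 = -math.log(1 - yi) / xi ** theta2
    return theta1, theta2

theta1_0, theta2_0 = weibull_start(1.00, 0.038, 5.00, 0.188)
```

The formula requires $0 < y < 1$ at both points and $x_i \ne x_j$; choosing two well-separated observations tends to give a more stable value for the exponent.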


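The next problem is to "linearize" the model. Recall that if
$\mathbf{T} = (t_1, t_2, \ldots, t_p)$ is the current best estimate of
$\boldsymbol{\theta} = (\theta_1, \theta_2, \ldots, \theta_p)$,
then the nonlinear model is approximated by

\[ y - f(\mathbf{x}, \mathbf{T}) = \frac{\partial f(\mathbf{x}, \mathbf{T})}{\partial \theta_1} (\theta_1 - t_1) + \frac{\partial f(\mathbf{x}, \mathbf{T})}{\partial \theta_2} (\theta_2 - t_2) + \cdots + \frac{\partial f(\mathbf{x}, \mathbf{T})}{\partial \theta_p} (\theta_p - t_p) + r. \]

Here $f(x, \boldsymbol{\theta}) = 1 - e^{-\theta_1 x^{\theta_2}}$, and the partial derivatives are

\[ \frac{\partial f(x, \mathbf{T})}{\partial \theta_1} = x^{t_2}\, e^{-t_1 x^{t_2}} \qquad \text{and} \qquad \frac{\partial f(x, \mathbf{T})}{\partial \theta_2} = t_1 \ln(x)\, x^{t_2}\, e^{-t_1 x^{t_2}}, \]

so the linear approximation is

\[ y - \left[ 1 - e^{-t_1 x^{t_2}} \right] = \left[ x^{t_2}\, e^{-t_1 x^{t_2}} \right] (\theta_1 - t_1) + \left[ t_1 \ln(x)\, x^{t_2}\, e^{-t_1 x^{t_2}} \right] (\theta_2 - t_2) + r. \]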

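From this the matrices $Z$ and $W$ may be identified:

\[
Z = \begin{bmatrix}
y_1 - \left[ 1 - e^{-t_1 x_1^{t_2}} \right] \\
y_2 - \left[ 1 - e^{-t_1 x_2^{t_2}} \right] \\
\vdots \\
y_n - \left[ 1 - e^{-t_1 x_n^{t_2}} \right]
\end{bmatrix},
\qquad
W = \begin{bmatrix}
x_1^{t_2}\, e^{-t_1 x_1^{t_2}} & t_1 \ln(x_1)\, x_1^{t_2}\, e^{-t_1 x_1^{t_2}} \\
x_2^{t_2}\, e^{-t_1 x_2^{t_2}} & t_1 \ln(x_2)\, x_2^{t_2}\, e^{-t_1 x_2^{t_2}} \\
\vdots & \vdots \\
x_n^{t_2}\, e^{-t_1 x_n^{t_2}} & t_1 \ln(x_n)\, x_n^{t_2}\, e^{-t_1 x_n^{t_2}}
\end{bmatrix}.
\]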


Now that starting values have been found and we know how to compute $Z$ and
$W$, we merely follow the algorithm expressed in the flowchart. The visibility
data from Goose, Newfoundland converged after 7 iterations. Convergence was
assumed when the proportional change in the RSS for two consecutive estimates
was less than $1 \times 10^{-8}$. Below are the intermediate and final estimates,
computed by a program written in PROC MATRIX of the Statistical Analysis System.
Double precision arithmetic was used.


Iteration               $\theta_1$          $\theta_2$          RSS

0 (Starting values)     .0387408     1.045        .010229
1                       .0560605     .734919      .00336779
2                       .0546557     .787792      .00261866
3                       .0553048     .77971       .00261354
4                       .055203      .780875      .00261344
5                       .055218      .780708      .00261344
6                       .0552159     .780732      .00261344
7                       .0552162     .780729      .00261344
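The whole procedure for the Weibull model can be sketched in Python as below. This is not the PROC MATRIX program used for the table above, and it does not use the Goose observations: the data are synthetic, generated from assumed parameters (0.055, 0.78) with a small alternating perturbation, and every name in the block (`f`, `jacobian`, `rss`, `fit_linearization`, `x_syn`, `y_syn`) is hypothetical. The stopping rule is the proportional change in RSS described earlier.

```python
import numpy as np

def f(x, th):
    """Weibull model f(x, theta) = 1 - exp(-theta1 * x**theta2)."""
    return 1.0 - np.exp(-th[0] * x ** th[1])

def jacobian(x, th):
    """Matrix W: one column per partial derivative, evaluated at th."""
    e = np.exp(-th[0] * x ** th[1])
    w1 = x ** th[1] * e                        # df/d(theta1)
    w2 = th[0] * np.log(x) * x ** th[1] * e    # df/d(theta2)
    return np.column_stack([w1, w2])

def rss(x, y, th):
    return float(np.sum((y - f(x, th)) ** 2))

def fit_linearization(x, y, start, eps2=1e-8, max_iter=50):
    """Iterate theta = T + (W^T W)^{-1} W^T Z until RSS stops changing."""
    T = np.asarray(start, dtype=float)
    history = [(T.copy(), rss(x, y, T))]
    for _ in range(max_iter):
        W = jacobian(x, T)
        Z = y - f(x, T)
        delta = np.linalg.solve(W.T @ W, W.T @ Z)
        theta = T + delta
        rss_old = history[-1][1]
        rss_new = rss(x, y, theta)
        history.append((theta.copy(), rss_new))
        if abs(rss_new - rss_old) < eps2 * rss_old:
            return theta, history
        T = theta
    raise RuntimeError("linearization method did not converge")

# Synthetic data (an assumption for illustration, not the report's observations).
true_theta = np.array([0.055, 0.78])
x_syn = np.array([0.25, 0.5, 0.75, 1.0, 1.25, 1.5, 2.0, 2.5, 3.0, 4.0, 5.0, 6.0])
y_syn = f(x_syn, true_theta) + 0.002 * (-1.0) ** np.arange(x_syn.size)

theta_fit, history = fit_linearization(x_syn, y_syn, start=(0.04, 1.0))
rss_fit = history[-1][1]
```

The `history` list plays the role of the iteration table: each entry holds an estimate and its RSS. No step-size control is included, so, as the text warns, a poor starting point can make the plain iteration fail.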